constraint equation
Counterfactual Explanations for k-means and Gaussian Clustering
Vardakas, Georgios, Karra, Antonia, Pitoura, Evaggelia, Likas, Aristidis
Counterfactuals have been recognized as an effective approach to explain classifier decisions. Nevertheless, they have not yet been considered in the context of clustering. In this work, we propose the use of counterfactuals to explain clustering solutions. First, we present a general definition for counterfactuals for model-based clustering that includes plausibility and feasibility constraints. Then we consider the counterfactual generation problem for k-means and Gaussian clustering assuming Euclidean distance. Our approach takes as input the factual, the target cluster, a binary mask indicating actionable or immutable features and a plausibility factor specifying how far from the cluster boundary the counterfactual should be placed. In the k-means clustering case, analytical mathematical formulas are presented for computing the optimal solution, while in the Gaussian clustering case (assuming full, diagonal, or spherical covariances) our method requires the numerical solution of a nonlinear equation with a single parameter only. We demonstrate the advantages of our approach through illustrative examples and quantitative experimental comparisons.
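The k-means case can be illustrated with a small sketch. It is not the authors' formulation: it assumes only two clusters (source and target), Euclidean distance, and a binary mask, and it uses a hypothetical `margin` argument as a stand-in for the plausibility factor, whose exact definition is not given in the abstract. Under those assumptions, the closest point on the (shifted) bisecting hyperplane between the two centers has a closed form.

```python
import numpy as np


def kmeans_counterfactual(x, c_source, c_target, mask=None, margin=0.0):
    """Closest point to x on the (shifted) bisector of two k-means centers.

    mask   : 1 for actionable features, 0 for immutable ones
             (all features actionable if None). Must be binary.
    margin : hypothetical stand-in for the plausibility factor; a positive
             value shifts the hyperplane w.y = b into the target cluster.
    """
    x = np.asarray(x, dtype=float)
    c_source = np.asarray(c_source, dtype=float)
    c_target = np.asarray(c_target, dtype=float)
    mask = np.ones_like(x) if mask is None else np.asarray(mask, dtype=float)

    # Points equidistant from both centers satisfy w.y = b.
    w = c_target - c_source
    b = 0.5 * (c_target @ c_target - c_source @ c_source)

    # Only actionable coordinates may move, so project along the masked normal.
    w_act = w * mask
    denom = w_act @ w_act
    if denom == 0.0:
        raise ValueError("no actionable feature changes the cluster assignment")

    step = (b + margin - w @ x) / denom
    return x + step * w_act


# Factual at the source center; the second feature is immutable.
y = kmeans_counterfactual([0.0, 0.0], [0.0, 0.0], [2.0, 2.0], mask=[1, 0])
print(y)  # point equidistant from both centers, second feature unchanged
```

With more than two clusters, the returned point may fall in a third cluster, so a full method would need to check the final assignment.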
Increasing transformer token length with a Maximum Entropy Principle Method
Transformers suffer from the computational overhead of their quadratic dependence on the length of the sequences they process. We present three methods, all adding an intermediate step between training and inference/generation, which extend the autoregressive length of transformers. All rely on a Maximum Entropy Principle (MEP) whereby entropy is maximized in the presence of suitable constraints, accounted for by use of Lagrange multipliers. These constraint methods extend the autoregressive character from T to 2T tokens in a linear-with-T fashion. The added step incurs some overhead, but the resulting methods should still be faster than the standard methods.
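The abstract does not spell out its constraints, so the following is only a generic illustration of the Maximum Entropy Principle with a Lagrange multiplier, not the paper's method. It assumes a discrete variable with a single mean constraint, for which the maximum-entropy distribution has the form p_i ∝ exp(λ x_i); the function name and the bisection bracket are choices made for this sketch.

```python
import numpy as np


def maxent_with_mean(values, target_mean, tol=1e-10, max_iter=200):
    """Maximum-entropy distribution over `values` with a fixed mean.

    Solves for the Lagrange multiplier lam in p_i ∝ exp(lam * x_i) by
    bisection; the mean is monotonically increasing in lam.
    """
    values = np.asarray(values, dtype=float)
    if not values.min() < target_mean < values.max():
        raise ValueError("target_mean must lie strictly inside the value range")

    def distribution(lam):
        logits = lam * values
        p = np.exp(logits - logits.max())  # subtract max for numerical stability
        p /= p.sum()
        return p, p @ values

    lo, hi = -50.0, 50.0  # assumed bracket; widen for very small value scales
    for _ in range(max_iter):
        lam = 0.5 * (lo + hi)
        p, mean = distribution(lam)
        if abs(mean - target_mean) < tol:
            break
        if mean < target_mean:
            lo = lam
        else:
            hi = lam
    return p, lam


# A six-sided die constrained to have mean 4.5 instead of 3.5.
p, lam = maxent_with_mean(np.arange(1, 7), 4.5)
print(p, lam)
```

The constraint on the mean enters only through the single multiplier λ, which is the sense in which Lagrange multipliers "account for" constraints in an MEP setting.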
Learning Constrained Dynamics with Gauss Principle adhering Gaussian Processes
Geist, A. Rene, Trimpe, Sebastian
The identification of the constrained dynamics of mechanical systems is often challenging. Learning methods promise to ease the analytical modeling effort, but require considerable amounts of training data. We propose to combine insights from analytical mechanics with Gaussian process regression to improve the model's data efficiency and constraint integrity. The result is a Gaussian process model that incorporates a priori constraint knowledge such that its predictions adhere to Gauss' principle of least constraint. In return, predictions of the system's acceleration naturally respect potentially non-ideal (non-)holonomic equality constraints. As a corollary, our model makes it possible to infer the acceleration of the unconstrained system from data of the constrained system and to transfer knowledge between differing constraint configurations.
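Gauss' principle itself can be shown in a few lines. The sketch below does not reproduce the Gaussian process model; it covers only the textbook case of ideal constraints written as A a = b, where the constrained acceleration minimizes (a - a_u)^T M (a - a_u) and has a closed-form solution. The function and argument names are chosen for this illustration, and M is assumed symmetric positive definite with A of full row rank.

```python
import numpy as np


def constrained_acceleration(M, a_unconstrained, A, b):
    """Acceleration satisfying A a = b that is closest to the unconstrained
    acceleration in the mass-weighted norm (Gauss' principle, ideal case).

    Closed form: a = a_u + M^-1 A^T (A M^-1 A^T)^-1 (b - A a_u).
    """
    M = np.asarray(M, dtype=float)
    a_u = np.asarray(a_unconstrained, dtype=float)
    A = np.atleast_2d(np.asarray(A, dtype=float))
    b = np.atleast_1d(np.asarray(b, dtype=float))

    Minv_At = np.linalg.solve(M, A.T)                # M^-1 A^T
    lam = np.linalg.solve(A @ Minv_At, b - A @ a_u)  # constraint multipliers
    return a_u + Minv_At @ lam


# Unit-mass particle on a horizontal table: gravity is cancelled by the
# constraint a_y = 0, while the horizontal acceleration is untouched.
a = constrained_acceleration(np.eye(2), [1.0, -9.81], [[0.0, 1.0]], [0.0])
print(a)
```

Read in reverse, the same relation links constrained and unconstrained accelerations, which is the kind of structure the abstract says the model exploits.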
A Review of Machine Learning Applications in Fuzzing
Saavedra, Gary J, Rodhouse, Kathryn N, Dunlavy, Daniel M, Kegelmeyer, Philip W
Fuzzing has played an important role in improving software development and testing over the course of several decades. Recent research in fuzzing has focused on applications of machine learning (ML), offering useful tools to overcome challenges in the fuzzing process. This review surveys the current research in applying ML to fuzzing. Specifically, this review discusses successful applications of ML to fuzzing, briefly explores challenges encountered, and motivates future research to address fuzzing bottlenecks.
A Modified Construction for a Support Vector Classifier to Accommodate Class Imbalances
Given a training set with binary class labels, the Support Vector Machine identifies the hyperplane that maximizes the margin between the two classes of training data. This general formulation is useful in that it can be applied without regard to variance differences between the classes. Ignoring these differences is not optimal, however, as the general SVM will give the class with lower variance an unjustifiably wide berth. This increases the chance of misclassifying the other class and results in an overall loss of predictive performance. An alternate construction is proposed in which the margins of the separating hyperplane are different for each class, each proportional to the standard deviation of its class along the direction perpendicular to the hyperplane. The construction agrees with the SVM in the case of equal class variances. The paper then examines the impact of the modified constraint equations on the dual representation.
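The idea of class-dependent margins can be illustrated for a fixed direction. This is a deliberately simplified sketch and not the paper's construction: the normal direction is supplied by the caller rather than optimized, the classes are assumed separable along it, and the function name is hypothetical. The gap between the classes is split in proportion to each class's standard deviation along the direction, which reduces to the usual midpoint when the deviations are equal.

```python
import numpy as np


def variance_scaled_threshold(X_neg, X_pos, direction):
    """Place a threshold along `direction` so that each class's margin is
    proportional to its standard deviation along that direction.

    Returns (threshold, margin_neg, margin_pos) in projected coordinates.
    """
    u = np.asarray(direction, dtype=float)
    u = u / np.linalg.norm(u)
    p_neg = np.asarray(X_neg, dtype=float) @ u
    p_pos = np.asarray(X_pos, dtype=float) @ u

    gap = p_pos.min() - p_neg.max()
    if gap <= 0:
        raise ValueError("classes are not separable along this direction")

    s_neg, s_pos = p_neg.std(), p_pos.std()
    total = s_neg + s_pos
    frac = 0.5 if total == 0 else s_neg / total  # midpoint if both are zero

    margin_neg = gap * frac
    margin_pos = gap - margin_neg
    return p_neg.max() + margin_neg, margin_neg, margin_pos


# The positive class is four times as spread out, so it gets four times
# the margin; a standard SVM would place the threshold at the midpoint.
X_neg = [[0.0, 0.0], [1.0, 0.0], [2.0, 0.0]]
X_pos = [[4.0, 0.0], [8.0, 0.0], [12.0, 0.0]]
print(variance_scaled_threshold(X_neg, X_pos, [1.0, 0.0]))
```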